Installing and importing libraries
In [1]:
!pip install geopy
Requirement already satisfied: geopy in c:\users\brahi\anaconda3\lib\site-packages (2.4.1)
Requirement already satisfied: geographiclib<3,>=1.52 in c:\users\brahi\anaconda3\lib\site-packages (from geopy) (2.0)
In [2]:
import pandas as pd
import folium
from folium import plugins
import seaborn as sns
import plotly.graph_objects as go
import plotly.express as px
import matplotlib.pyplot as plt
import plotly.io as pio
pio.renderers.default = "notebook"
Data exploration
In [4]:
data= pd.read_csv('hop.csv')
In [5]:
data
Out[5]:
| | Liste nominative des établissements hospitaliers par catégorie 2022 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 |
|---|---|---|---|---|---|---|---|---|
| 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | Région | Delegation | Commune | Etablissement hospitalier | Catégorie | NaN | Liste des abréviations | NaN |
| 2 | Tanger-Tetouan-Al Hoceima | Al Hoceima | Al Hoceima (Mun.) | Mohamed V | HP | NaN | HP | Hôpital Provincial/Préfectoral |
| 3 | Tanger-Tetouan-Al Hoceima | Al Hoceima | Al Hoceima (Mun.) | C. d'oncologie d'Al Hoceima | CRO | NaN | HR | Hôpital Régional |
| 4 | Tanger-Tetouan-Al Hoceima | Al Hoceima | Imzouren (Mun.) | Imzouren | HPr | NaN | HIR | Hospital Interrégional |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 167 | Laayoune-Sakia El Hamra | Es Semara | Es-semara (Mun.) | Es-Smara | HP | NaN | NaN | NaN |
| 168 | Laayoune-Sakia El Hamra | Laayoune | Laayoune (Mun.) | My Hassan Ben El Mehdi | HR | NaN | NaN | NaN |
| 169 | Laayoune-Sakia El Hamra | Laayoune | Laayoune (Mun.) | Hassan II | HR | NaN | NaN | NaN |
| 170 | Laayoune-Sakia El Hamra | Laayoune | Laayoune (Mun.) | Laayoune | CRO | NaN | NaN | NaN |
| 171 | Eddakhla-Oued Eddahab | Oued Ed-Dahab | Dakhla (Mun.) | Hassan II | HR | NaN | NaN | NaN |
172 rows × 8 columns
In [7]:
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 172 entries, 0 to 171
Data columns (total 8 columns):
 #   Column                                                               Non-Null Count  Dtype
---  ------                                                               --------------  -----
 0   Liste nominative des établissements hospitaliers par catégorie 2022  171 non-null    object
 1   Unnamed: 1                                                           171 non-null    object
 2   Unnamed: 2                                                           171 non-null    object
 3   Unnamed: 3                                                           171 non-null    object
 4   Unnamed: 4                                                           171 non-null    object
 5   Unnamed: 5                                                           0 non-null      float64
 6   Unnamed: 6                                                           9 non-null      object
 7   Unnamed: 7                                                           8 non-null      object
dtypes: float64(1), object(7)
memory usage: 10.9+ KB
Observation: the table has 172 rows and 8 columns, and every column except the first has an 'Unnamed' header
In [9]:
data.isna().sum()
Out[9]:
Liste nominative des établissements hospitaliers par catégorie 2022      1
Unnamed: 1                                                               1
Unnamed: 2                                                               1
Unnamed: 3                                                               1
Unnamed: 4                                                               1
Unnamed: 5                                                             172
Unnamed: 6                                                             163
Unnamed: 7                                                             164
dtype: int64
Observation: of the 8 columns, one (Unnamed: 5) is completely null, and the first row is completely null as well
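Both observations can also be confirmed programmatically rather than by reading the counts. The sketch below uses a small stand-in frame (the values are made up for illustration and are not taken from `hop.csv`) and lists the columns and rows in which every cell is missing.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the raw table: row 0 and column "c" are entirely empty.
toy = pd.DataFrame({
    "a": [np.nan, "Région", "Oriental"],
    "b": [np.nan, "Commune", "Oujda"],
    "c": [np.nan, np.nan, np.nan],
})

# Columns where every value is missing
empty_columns = toy.columns[toy.isna().all()].tolist()
# Row labels where every value is missing
empty_rows = toy.index[toy.isna().all(axis=1)].tolist()

print(empty_columns)
print(empty_rows)
```

Applied to `data`, the same two expressions should point at the empty column and the empty first row described above.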
Data cleaning and feature engineering
#### I dropped some columns for these reasons:
* Unnamed: 5 : it is entirely empty
* Unnamed: 6 : it only holds the list of category abbreviations, which adds no information about individual hospitals
* Unnamed: 7 : it holds the meaning of each abbreviation, which is useful but has to be mapped to the categories in (Unnamed: 4), so I handle it separately
#### Mapping process : I created a dict that maps each category abbreviation to its meaning, then applied it with the map() function
In [13]:
data = data.drop(['Unnamed: 5','Unnamed: 6','Unnamed: 7'],axis=1)
In [14]:
columns_names=list(data.iloc[1])
data.columns = columns_names
hopital = data.drop([0,1])
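An alternative to promoting the header row by hand is to skip the leading lines when reading the file. The sketch below is only an illustration on a made-up miniature of the layout (a title line, an empty record, then the real header); whether `skiprows=2` is the right value for `hop.csv` is an assumption that depends on how the file is actually laid out.

```python
import io
import pandas as pd

# Made-up miniature of the file layout: a title line, an empty record,
# then the real header row followed by data.
raw = io.StringIO(
    "Title,Unnamed,Unnamed\n"
    ",,\n"
    "Région,Commune,Catégorie\n"
    "Oriental,Oujda (Mun.),HR\n"
    "Souss-Massa,Agadir (Mun.),HR\n"
)

# Skip the title line and the empty record so the third line becomes the header.
toy = pd.read_csv(raw, skiprows=2)
print(toy.columns.tolist())
print(len(toy))
```

With this approach the index starts at 0 instead of 2, so later code that relies on the original row labels would need to be adjusted.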
In [15]:
categories_mapping = {'HP':'Hôpital Provincial/Préfectoral', 'CRO' : "Centre Régional d'Oncologie", 'HPr':'Hôpital de Proximité',
'HPsyR':'Hôpital Psychiatrique Régional', 'HIR':'Hospital Interrégional', 'CPU':'Centre Psychiatrique Universitaire'
,'HPsyP':'Hôpital Psychiatrique Provincial/préfectoral', 'HR':'Hôpital Régional'}
hopital['catégorie signification'] = hopital['Catégorie'].map(categories_mapping)
In [16]:
hopital.isna().sum() # recheck for null values
Out[16]:
Région                       0
Delegation                   0
Commune                      0
Etablissement hospitalier    0
Catégorie                    0
catégorie signification      0
dtype: int64
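The zero in the 'catégorie signification' row matters because `map()` returns NaN for any abbreviation that is missing from the dictionary. A minimal sketch of that behaviour, using a made-up code `XYZ` that is not part of the dataset:

```python
import pandas as pd

# Hypothetical mini-example: "XYZ" is a made-up code with no dictionary entry.
codes = pd.Series(["HP", "HR", "XYZ"])
meanings = {"HP": "Hôpital Provincial/Préfectoral", "HR": "Hôpital Régional"}

mapped = codes.map(meanings)

# map() leaves NaN wherever a code is missing from the dictionary,
# so the unmapped codes can be listed explicitly.
unmapped = codes[mapped.isna()].unique().tolist()
print(unmapped)
```

An empty list here would mean every abbreviation was covered, which is what the null check above indicates for the real data.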
#### I created a new column called 'Type Administratif', which shows whether the commune is a 'Municipalité' or an 'Arrondissement'
Values in the Commune column follow the format Al Hoceima (Arrond.) / Al Hoceima (Mun.), where (Arrond.) stands for Arrondissement and (Mun.) stands for Municipalité, so I need to extract these two labels and store them in a new column.
In [18]:
#creating Type Administratif column (urban/rural)
hopital['Type Administratif'] = hopital['Commune'].str.extract(r'\((.*?)\)')[0]
hopital['Type Administratif'] = hopital['Type Administratif'].replace(['Mun.', 'Arrond.'], ['Municipalité', 'Arrondissement'])
In [19]:
# removing the administrative type from the 'Commune' column
hopital['Commune'] = hopital['Commune'].str.replace(r' \(Mun\.\)', '', regex=True)
hopital['Commune'] = hopital['Commune'].str.replace(r' \(Arrond\.\)', '', regex=True)
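The extraction and the clean-up can also be done in a single pass with two capture groups. This is a sketch on illustrative values, not a replacement for the cells above; the second commune name is made up for the example.

```python
import pandas as pd

# Illustrative values in the same "Name (Type)" format as the Commune column
communes = pd.Series(["Al Hoceima (Mun.)", "Agdal Riyad (Arrond.)"])

# One pattern with two capture groups: the name, then the text in parentheses.
parts = communes.str.extract(r"^(.*?)\s*\((.*?)\)$")
parts.columns = ["Commune", "Type Administratif"]

# Expand the abbreviations to their full labels.
parts["Type Administratif"] = parts["Type Administratif"].replace(
    {"Mun.": "Municipalité", "Arrond.": "Arrondissement"}
)
print(parts)
```

Rows that do not match the pattern come back as NaN in both columns, which makes unexpected formats easy to spot.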
In [20]:
hopital
Out[20]:
| | Région | Delegation | Commune | Etablissement hospitalier | Catégorie | catégorie signification | Type Administratif |
|---|---|---|---|---|---|---|---|
| 2 | Tanger-Tetouan-Al Hoceima | Al Hoceima | Al Hoceima | Mohamed V | HP | Hôpital Provincial/Préfectoral | Municipalité |
| 3 | Tanger-Tetouan-Al Hoceima | Al Hoceima | Al Hoceima | C. d'oncologie d'Al Hoceima | CRO | Centre Régional d'Oncologie | Municipalité |
| 4 | Tanger-Tetouan-Al Hoceima | Al Hoceima | Imzouren | Imzouren | HPr | Hôpital de Proximité | Municipalité |
| 5 | Tanger-Tetouan-Al Hoceima | Al Hoceima | Targuist | Targuist | HPr | Hôpital de Proximité | Municipalité |
| 6 | Tanger-Tetouan-Al Hoceima | Chefchaouen | Chefchaouen | Mohamed V | HP | Hôpital Provincial/Préfectoral | Municipalité |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 167 | Laayoune-Sakia El Hamra | Es Semara | Es-semara | Es-Smara | HP | Hôpital Provincial/Préfectoral | Municipalité |
| 168 | Laayoune-Sakia El Hamra | Laayoune | Laayoune | My Hassan Ben El Mehdi | HR | Hôpital Régional | Municipalité |
| 169 | Laayoune-Sakia El Hamra | Laayoune | Laayoune | Hassan II | HR | Hôpital Régional | Municipalité |
| 170 | Laayoune-Sakia El Hamra | Laayoune | Laayoune | Laayoune | CRO | Centre Régional d'Oncologie | Municipalité |
| 171 | Eddakhla-Oued Eddahab | Oued Ed-Dahab | Dakhla | Hassan II | HR | Hôpital Régional | Municipalité |
170 rows × 7 columns
In [22]:
hopital_region=pd.DataFrame(hopital.groupby('Région')['Etablissement hospitalier'].count())
hopital_region
Out[22]:
| Région | Etablissement hospitalier |
|---|---|
| Béni Mellal-Khénifra | 12 |
| Casablanca-Settat | 27 |
| Drâa-Tafilalet | 11 |
| Eddakhla-Oued Eddahab | 1 |
| Fès-Meknès | 21 |
| Guelmim-Oued Noun | 5 |
| Laayoune-Sakia El Hamra | 5 |
| Marrakech-Safi | 20 |
| Oriental | 16 |
| Rabat-Salé-Kénitra | 20 |
| Souss-Massa | 9 |
| Tanger-Tetouan-Al Hoceima | 23 |
Visualisation
#### I created a table with the number of hospitals, the population, and the estimated deaths and births in 2022, grouped by region
The purpose of this table :
- Compare the distribution of hospitals across the 12 regions.
- Compare population size across the 12 regions.
- Investigate the correlation between population size and the number of hospitals.
- Investigate the correlation between population size and the number of deaths.
- Investigate the correlation between population size and the number of births.
Note: The population column contains estimated values derived from multiple sources and may not accurately represent the population in 2022.
##### Calculations for Deaths and Births
Estimated Deaths (2022):
We used a crude death rate of 6.6 deaths per 1,000 people for Morocco in 2022.
For each region, we calculated:
Total Deaths = (Population * 6.6) / 1,000
Example: For Tanger-Tétouan-Al Hoceïma (3,651,427 people), deaths = (3,651,427 * 6.6) / 1,000 ≈ 24,099.
Estimated Births (2022):
We used a crude birth rate of 17.16 births per 1,000 people for Morocco in 2022.
For each region, we calculated:
Total Births = (Population * 17.16) / 1,000
Example: For Tanger-Tétouan-Al Hoceïma (3,651,427 people), births = (3,651,427 * 17.16) / 1,000 ≈ 62,658.
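The two formulas can be checked with a few lines of Python. The helper name `estimate_events` is hypothetical and only serves this sketch; the rates and the population figure are the ones quoted in the text.

```python
# Crude rates per 1,000 inhabitants, as stated in the text
DEATH_RATE = 6.6
BIRTH_RATE = 17.16


def estimate_events(population, rate_per_thousand):
    """Return the estimated yearly count for a given crude rate."""
    return population * rate_per_thousand / 1000


population = 3_651_427  # Tanger-Tétouan-Al Hoceïma
deaths = estimate_events(population, DEATH_RATE)
births = estimate_events(population, BIRTH_RATE)
print(round(deaths), round(births))
```

Because a single national rate is applied to every region, these figures differ between regions only through population size.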
In [39]:
import pandas as pd
# Your population dictionary
population_2022 = {
"Tanger-Tétouan-Al Hoceïma": 3651427,
"Lâayoune-Sakia El Hamra": 370000,
"Oued Ed-Dahkla-Oued Eddahab": 130000,
"Casablanca-Settat": 6950000,
"Rabat-Salé-Kénitra": 4850000,
"Fès-Meknès": 4150000,
"Marrakech-Safi": 4550000,
"Drâa-Tafilalet": 1650000,
"Souss-Massa": 2900000,
"Guelmim-Oued Noun": 430000,
"Béni Mellal-Khénifra": 3550000,
"Oriental": 2450000
}
hopital_region.index = hopital_region.index.str.replace('Tanger-Tetouan-Al Hoceima', 'Tanger-Tétouan-Al Hoceïma')
hopital_region.index = hopital_region.index.str.replace('Laayoune-Sakia El Hamra', 'Lâayoune-Sakia El Hamra')
hopital_region.index = hopital_region.index.str.replace('Eddakhla-Oued Eddahab', 'Oued Ed-Dahkla-Oued Eddahab')
hopital_region['Population (approximativement)'] = hopital_region.index.map(population_2022)
hopital_region['Estimated Deaths (2022)'] = (hopital_region['Population (approximativement)'] * 6.6) / 1000
hopital_region['Estimated Births (2022)'] = (hopital_region['Population (approximativement)'] * 17.16) / 1000
display(hopital_region)
| Région | Etablissement hospitalier | Population (approximativement) | Estimated Deaths (2022) | Estimated Births (2022) |
|---|---|---|---|---|
| Béni Mellal-Khénifra | 12 | 3550000 | 23430.0000 | 60918.00000 |
| Casablanca-Settat | 27 | 6950000 | 45870.0000 | 119262.00000 |
| Drâa-Tafilalet | 11 | 1650000 | 10890.0000 | 28314.00000 |
| Oued Ed-Dahkla-Oued Eddahab | 1 | 130000 | 858.0000 | 2230.80000 |
| Fès-Meknès | 21 | 4150000 | 27390.0000 | 71214.00000 |
| Guelmim-Oued Noun | 5 | 430000 | 2838.0000 | 7378.80000 |
| Lâayoune-Sakia El Hamra | 5 | 370000 | 2442.0000 | 6349.20000 |
| Marrakech-Safi | 20 | 4550000 | 30030.0000 | 78078.00000 |
| Oriental | 16 | 2450000 | 16170.0000 | 42042.00000 |
| Rabat-Salé-Kénitra | 20 | 4850000 | 32010.0000 | 83226.00000 |
| Souss-Massa | 9 | 2900000 | 19140.0000 | 49764.00000 |
| Tanger-Tétouan-Al Hoceïma | 23 | 3651427 | 24099.4182 | 62658.48732 |
In [41]:
fig = go.Figure(go.Bar(
x=hopital_region.index,
y=hopital_region['Etablissement hospitalier'],
text=hopital_region['Etablissement hospitalier'],
textposition='auto', # Automatically position text on bars
marker_color=['#4B0081', '#6A0DAD', '#8A2BE2', '#9400D3'] # Darker color palette
))
fig.update_layout(yaxis_title = "Nombre d'Hopitaux", xaxis_title='Région', title="Etablissement hospitalier par Région (Approximativement)")
In [43]:
fig_pop = go.Figure(go.Bar(
x=hopital_region.index,
y=hopital_region['Population (approximativement)'],
text=hopital_region['Population (approximativement)'],
textposition='auto', # Automatically position text on bars,
marker_color=['#4B0081', '#6A0DAD', '#8A2BE2', '#9400D3'] # Darker color palette
))
fig_pop.update_layout(yaxis_title = "Population", xaxis_title='Région', title="Population par Région (Approximativement)")
In [45]:
deaths_box = px.box(hopital_region, x ='Estimated Deaths (2022)', title='Distribution des Décès Estimés (2022)')
deaths_box
In [46]:
births_box = px.box(hopital_region, x ='Estimated Births (2022)', title= 'Distribution des Naissances Estimées (2022)')
births_box
In [48]:
fig_region_births_deaths = px.bar(hopital_region, x=hopital_region.index, y =list(hopital_region[['Estimated Births (2022)','Estimated Deaths (2022)']]),
title="Naissances et Décès Estimés par Région (2022)")
fig_region_births_deaths
In [51]:
correlation_pop_hop= px.scatter(hopital_region, y='Etablissement hospitalier', x='Population (approximativement)', trendline='ols', # Ordinary Least Squares regression
title="Correlation entre le nombre d'Établissements Hospitaliers et Population")
correlation_pop_hop
In [52]:
correlation_pop_death= px.scatter(hopital_region, y='Estimated Deaths (2022)', x='Population (approximativement)', trendline='ols', # Ordinary Least Squares regression
title="Correlation entre les Décès hospitalier et la Population")
correlation_pop_death
In [53]:
correlation_pop_births= px.scatter(hopital_region, y='Estimated Births (2022)', x='Population (approximativement)', trendline='ols', # Ordinary Least Squares regression
title="Correlation entre les Naissances et la Population")
correlation_pop_births
In [54]:
correlation_hop_births= px.scatter(hopital_region, y='Estimated Births (2022)', x='Etablissement hospitalier', trendline='ols', # Ordinary Least Squares regression
title="Corrélation entre le Nombre d'Établissements Hospitaliers et les Naissances")
correlation_hop_births
In [58]:
correlation_hop_deaths= px.scatter(hopital_region, y='Estimated Deaths (2022)', x='Etablissement hospitalier', trendline='ols', # Ordinary Least Squares regression
title="Corrélation entre le Nombre d'Établissements Hospitaliers et le Nombre de Décès")
correlation_hop_deaths
In [61]:
correlation_matrix = hopital_region.corr()
plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f')
plt.title('Correlation Matrix')
plt.show()
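One point to keep in mind when reading the matrix: the estimated deaths and births are constant multiples of the population column, so their correlation with population (and with each other) is 1 by construction. Only the row and column for the number of hospitals carry independent information. A minimal sketch with a subset of the rounded populations from the table:

```python
import numpy as np

# Rounded regional populations taken from the table above (a subset)
population = np.array([3_550_000, 6_950_000, 1_650_000, 130_000, 4_150_000])

# Deaths are defined as a fixed multiple of population
deaths = population * 6.6 / 1000

# Pearson correlation between the two series
r = np.corrcoef(population, deaths)[0, 1]
print(round(r, 6))
```

The same holds for births, since they are derived from population with a different constant.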
#### I created a table grouped by both the 'Type Administratif' and 'catégorie signification' columns, with the count as the third column.
#### Count : the number of hospitals of each category within each administrative type
#### Purpose of this table :
* Compare the distribution of categories across each type of administration (Arrondissement and Municipalité) using a stacked bar chart.
In [64]:
# Group the data by 'Type Administratif' and 'catégorie signification' and count the occurrences
rurale_urbaine = hopital.groupby(['Type Administratif', 'catégorie signification']).size().reset_index(name='Count')
rurale_urbaine
Out[64]:
| | Type Administratif | catégorie signification | Count |
|---|---|---|---|
| 0 | Arrondissement | Centre Psychiatrique Universitaire | 3 |
| 1 | Arrondissement | Hospital Interrégional | 20 |
| 2 | Arrondissement | Hôpital Provincial/Préfectoral | 9 |
| 3 | Arrondissement | Hôpital Psychiatrique Régional | 1 |
| 4 | Arrondissement | Hôpital Régional | 10 |
| 5 | Arrondissement | Hôpital de Proximité | 4 |
| 6 | Municipalité | Centre Psychiatrique Universitaire | 2 |
| 7 | Municipalité | Centre Régional d'Oncologie | 4 |
| 8 | Municipalité | Hospital Interrégional | 4 |
| 9 | Municipalité | Hôpital Provincial/Préfectoral | 63 |
| 10 | Municipalité | Hôpital Psychiatrique Provincial/préfectoral | 3 |
| 11 | Municipalité | Hôpital Régional | 8 |
| 12 | Municipalité | Hôpital de Proximité | 33 |
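The long table above omits combinations that never occur (for example, there is no row for Centre Régional d'Oncologie under Arrondissement). If a wide layout with explicit zeros is easier to read, `pd.crosstab` is one option. The rows below are made up for illustration; only the column names mirror the notebook's.

```python
import pandas as pd

# Made-up rows for illustration; the column names mirror the notebook's.
toy = pd.DataFrame({
    "Type Administratif": ["Municipalité", "Municipalité", "Arrondissement"],
    "catégorie signification": [
        "Hôpital Régional", "Hôpital de Proximité", "Hôpital Régional",
    ],
})

# Rows: administrative type; columns: category; cells: number of hospitals.
# Combinations that never occur are filled with 0.
wide = pd.crosstab(toy["Type Administratif"], toy["catégorie signification"])
print(wide)
```

The long format produced by `groupby(...).size()` remains the more convenient input for `px.bar` with a `color` argument, so this is a reading aid rather than a replacement.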
In [66]:
fig_cat_adm = px.bar(rurale_urbaine, x='Type Administratif', y ='Count',color='catégorie signification' )
fig_cat_adm
#### This map helps you find hospitals across Morocco, showing each hospital's region and whether it is in a rural or an urban area.
- Markers: Colors show the region (for example, purple for Casablanca-Settat); icons show the administrative type.
- Explore: Zoom in or out, and click a marker for more hospital info.
- Legend: Look at the bottom-left to decode colors and icons.
In [648]:
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter
import folium
import pandas as pd
import numpy as np
import random
import time
# Assume 'hopital' is your DataFrame; replace with your actual data loading if needed
# Example: hopital = pd.read_csv("hospitals.csv")
# Step 1: Set up the geocoder with rate limiting and retries
geolocator = Nominatim(user_agent="hospital_map", timeout=10)
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1, max_retries=3, error_wait_seconds=2)
# Step 2: Define a color palette for regions
region_colors = {
"Tanger-Tétouan-Al Hoceïma": "blue",
"Lâayoune-Sakia El Hamra": "red",
"Oued Ed-Dahkla-Oued Eddahab": "green",
"Casablanca-Settat": "purple",
"Rabat-Salé-Kénitra": "orange",
"Fès-Meknès": "pink",
"Marrakech-Safi": "darkgreen",
"Drâa-Tafilalet": "darkblue",
"Souss-Massa": "darkred",
"Guelmim-Oued Noun": "lightblue",
"Béni Mellal-Khénifra": "lightgreen",
"Oriental": "gray"
}
# Step 3: Geocode communes with a fallback to region coordinates
commune_coords = {}
country = "Morocco"
failed_communes = []
for commune in hopital["Commune"].unique():
    clean_commune = commune.replace(" (Arrond.)", "").replace(" (Mun.)", "")
    try:
        location = geocode(f"{clean_commune}, {country}")
        if location:
            commune_coords[commune] = (location.latitude, location.longitude)
        else:
            print(f"Couldn’t find coordinates for {clean_commune}, {country}")
            failed_communes.append(commune)
    except Exception as e:
        print(f"Error with {clean_commune}: {e}")
        failed_communes.append(commune)
# Fallback: Use region coordinates for failed communes
for commune in failed_communes:
    region = hopital[hopital["Commune"] == commune]["Région"].iloc[0]
    try:
        location = geocode(f"{region}, {country}")
        if location:
            # Add a larger offset to approximate commune location within region
            offset_lat = random.uniform(-0.1, 0.1)  # ~10 km
            offset_lon = random.uniform(-0.1, 0.1)
            commune_coords[commune] = (location.latitude + offset_lat, location.longitude + offset_lon)
            print(f"Using region fallback for {commune}: {commune_coords[commune]}")
        else:
            print(f"Couldn’t geocode region {region} for {commune}")
    except Exception as e:
        print(f"Error geocoding region {region} for {commune}: {e}")
# Step 4: Assign coordinates to each hospital with unique keys
hospital_coords = {}
hospital_groups = hopital.groupby("Commune")
for commune, group in hospital_groups:
    if commune in commune_coords:
        base_coord = commune_coords[commune]
        num_hospitals = len(group)
        if num_hospitals == 1:
            unique_key = f"{group.iloc[0]['Etablissement hospitalier']}_{commune}"
            hospital_coords[unique_key] = base_coord
        else:
            for _, row in group.iterrows():
                offset_lat = random.uniform(-0.001, 0.001)  # ~100 meters
                offset_lon = random.uniform(-0.001, 0.001)
                unique_key = f"{row['Etablissement hospitalier']}_{commune}"
                hospital_coords[unique_key] = (
                    base_coord[0] + offset_lat,
                    base_coord[1] + offset_lon
                )
    else:
        print(f"Commune {commune} still not geocoded after fallback.")
# Step 5: Create the map
if hospital_coords:
    latitudes = [coord[0] for coord in hospital_coords.values()]
    longitudes = [coord[1] for coord in hospital_coords.values()]
    mean_lat = np.mean(latitudes)
    mean_lon = np.mean(longitudes)
    m = folium.Map(location=[mean_lat, mean_lon], zoom_start=6)
    # The raw 'Région' values are unaccented while region_colors uses accented
    # spellings, so translate the three names that differ before the color lookup
    region_aliases = {
        "Tanger-Tetouan-Al Hoceima": "Tanger-Tétouan-Al Hoceïma",
        "Laayoune-Sakia El Hamra": "Lâayoune-Sakia El Hamra",
        "Eddakhla-Oued Eddahab": "Oued Ed-Dahkla-Oued Eddahab",
    }
    # Step 6: Plot each hospital with region color, admin type icon, and category in popup
    for _, row in hopital.iterrows():
        hospital_name = row["Etablissement hospitalier"]
        commune = row["Commune"]
        unique_key = f"{hospital_name}_{commune}"
        if unique_key in hospital_coords:
            coord = hospital_coords[unique_key]
            region = row["Région"]
            category = row["Catégorie"]
            signification = row["catégorie signification"]
            admin_type = row["Type Administratif"]
            # Assign color based on region (gray if the region is unknown)
            color = region_colors.get(region_aliases.get(region, region), "gray")
            # Assign icon based on administrative type
            if admin_type == "Arrondissement":
                icon_name = "building"  # Urban
            elif admin_type == "Municipalité":
                icon_name = "home"  # Rural
            else:
                icon_name = "info-sign"  # Default
            # Add marker with detailed popup
            folium.Marker(
                location=coord,
                popup=f"{hospital_name}<br>Commune: {commune}<br>Région: {region}<br>Catégorie: {category} - {signification}<br>Type: {admin_type}",
                icon=folium.Icon(color=color, icon=icon_name, prefix="fa")
            ).add_to(m)
    # Step 7: Add a legend
    legend_html = '''
    <div style="position: fixed; bottom: 50px; left: 50px; z-index: 1000; padding: 10px; background-color: white; border: 2px solid gray; border-radius: 5px;">
    <h4>Legend</h4>
    <h5>Regions</h5>
    <ul>
    {}
    </ul>
    <h5>Administrative Type</h5>
    <ul>
    <li><i class="fa fa-building" style="color: black;"></i> Arrondissement (Urban)</li>
    <li><i class="fa fa-home" style="color: black;"></i> Municipalité (Rural)</li>
    </ul>
    </div>
    '''.format(''.join([f'<li><span style="color: {color};">■</span> {region}</li>' for region, color in region_colors.items()]))
    # Add Font Awesome for icons
    m.get_root().html.add_child(folium.Element(
        '<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.3/css/all.min.css">'
    ))
    # Add the legend to the map
    m.get_root().html.add_child(folium.Element(legend_html))
    # Step 8: Save the map
    m.save("hospital_map.html")
    print("Map saved as 'hospital_map.html'")
else:
    print("No hospitals were geocoded successfully.")
Couldn’t find coordinates for Sidi Moussa Lemhaya, Morocco
Using region fallback for Sidi Moussa Lemhaya: (33.38027961774928, -2.4998206318442966)
Map saved as 'hospital_map.html'
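Because the offsets come from the module-level `random` functions, the marker positions (including the region fallback printed above) change on every run. If a reproducible map is wanted, one option is to draw the offsets from a seeded generator. The sketch below is an assumption about how that could look, not part of the original cell; `jitter` is a hypothetical helper, the seed value is arbitrary, and the base coordinates are illustrative.

```python
import random


def jitter(coord, max_offset_deg, rng):
    """Shift a (lat, lon) pair by a random amount of at most max_offset_deg."""
    lat, lon = coord
    return (
        lat + rng.uniform(-max_offset_deg, max_offset_deg),
        lon + rng.uniform(-max_offset_deg, max_offset_deg),
    )


base = (35.2446, -3.9321)  # illustrative coordinates, not geocoder output

# Two generators seeded identically produce the same offsets.
first = jitter(base, 0.001, random.Random(42))
second = jitter(base, 0.001, random.Random(42))
print(first == second)
```

In the notebook, a single `random.Random(seed)` instance created before Step 3 could be passed to every call so that all offsets are repeatable.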
In [512]:
m